@openshift-cherrypick-robot

This is an automated cherry-pick of #5305

/assign pablintino

@openshift-ci-robot

@openshift-cherrypick-robot: Jira Issue OCPBUGS-62341 has been cloned as Jira Issue OCPBUGS-63126. Will retitle bug to link to clone.
/retitle [release-4.19] OCPBUGS-63126: Ensure the node passed to RunCordonOrUncordon comes from the latest updated state

In response to this:

This is an automated cherry-pick of #5305

/assign pablintino

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci bot changed the title from "[release-4.19] OCPBUGS-62341: Ensure the node passed to RunCordonOrUncordon comes from the latest updated state" to "[release-4.19] OCPBUGS-63126: Ensure the node passed to RunCordonOrUncordon comes from the latest updated state" Oct 15, 2025
@openshift-ci-robot added labels Oct 15, 2025:
  • jira/severity-important: Referenced Jira bug's severity is important for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • jira/invalid-bug: Indicates that a referenced Jira bug is invalid for the branch this PR is targeting.
@openshift-ci-robot

@openshift-cherrypick-robot: This pull request references Jira Issue OCPBUGS-63126, which is invalid:

  • release note text must be set and not match the template OR release note type must be set to "Release Note Not Required". For more information you can reference the OpenShift Bug Process.
  • expected dependent Jira Issue OCPBUGS-62341 to target a version in 4.20.0, but it targets "4.21.0" instead

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

@sergiordlr

/label cherry-pick-approved

@openshift-ci bot added the cherry-pick-approved label (Indicates a cherry-pick PR into a release branch has been approved by the release branch manager.) Oct 15, 2025
@sergiordlr

Verified using IPI on AWS

  1. Create a webhook that rejects any attempt to change a node's .spec.unschedulable value, so that all cordon/uncordon operations fail

This is an example of a webhook failing all cordon/uncordon operations: https://github.com/sergiordlr/temp-testfiles/tree/master/webhook_example
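
As a rough illustration of the decision such a webhook makes, the following is a minimal Go sketch. It is not the code from the linked repository: the `node` and `nodeSpec` types and the `reviewUnschedulableChange` helper are hypothetical stand-ins, and a real webhook would decode a full AdmissionReview request and serve it over HTTPS. The denial message is copied from the controller logs in this comment.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// nodeSpec holds only the field this sketch cares about.
type nodeSpec struct {
	Unschedulable bool `json:"unschedulable,omitempty"`
}

// node is a minimal, hypothetical stand-in for the Kubernetes Node object.
type node struct {
	Spec nodeSpec `json:"spec"`
}

// reviewUnschedulableChange is a hypothetical helper. It takes the raw old and
// new Node objects from an update request and reports whether the update
// should be allowed. Any change to .spec.unschedulable is denied.
func reviewUnschedulableChange(oldRaw, newRaw []byte) (allowed bool, message string, err error) {
	var oldNode, newNode node
	if err := json.Unmarshal(oldRaw, &oldNode); err != nil {
		return false, "", fmt.Errorf("decoding old node: %w", err)
	}
	if err := json.Unmarshal(newRaw, &newNode); err != nil {
		return false, "", fmt.Errorf("decoding new node: %w", err)
	}
	if oldNode.Spec.Unschedulable != newNode.Spec.Unschedulable {
		return false, "Changing .spec.unschedulable on node is forbidden.", nil
	}
	return true, "", nil
}

func main() {
	schedulable := []byte(`{"spec":{}}`)
	cordoned := []byte(`{"spec":{"unschedulable":true}}`)

	// A cordon attempt flips the flag, so it is denied.
	allowed, msg, err := reviewUnschedulableChange(schedulable, cordoned)
	fmt.Println(allowed, msg, err)

	// An update that leaves the flag untouched is allowed.
	allowed, msg, err = reviewUnschedulableChange(cordoned, cordoned)
	fmt.Println(allowed, msg, err)
}
```

Because the check compares old and new values, it blocks both cordon and uncordon while leaving unrelated node updates alone.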

  2. Apply a MachineConfig so that the MCO cordons/uncordons the nodes to apply the config
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: test-machine-config-0
spec:
  config:
    ignition:
      version: 3.1.0
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf-8;base64,dGVzdA==
        path: /etc/test-file-0.test

  3. Check that the MCO controller cannot cordon the node and starts retrying
I1016 16:20:17.839772       1 drain_controller.go:193] node ip-10-0-22-244.us-east-2.compute.internal: cordoning
I1016 16:20:17.839814       1 drain_controller.go:193] node ip-10-0-22-244.us-east-2.compute.internal: initiating cordon (currently schedulable: true)
I1016 16:20:17.862667       1 drain_controller.go:581] cordon failed with: cordon error: admission webhook "unschedulable-webhook.default.svc" denied the request: Changing .spec.unschedulable on node is forbidden., retrying
I1016 16:20:27.863719       1 drain_controller.go:193] node ip-10-0-22-244.us-east-2.compute.internal: initiating cordon (currently schedulable: false)
I1016 16:20:27.866998       1 drain_controller.go:193] node ip-10-0-22-244.us-east-2.compute.internal: RunCordonOrUncordon() succeeded but node is still not in cordon state, retrying
  4. Remove the MutatingWebhookConfiguration created in step 1 to allow cordon/uncordon operations to succeed again
  5. Check that the controller can now cordon the node and start applying the config
I1016 16:20:17.862667       1 drain_controller.go:581] cordon failed with: cordon error: admission webhook "unschedulable-webhook.default.svc" denied the request: Changing .spec.unschedulable on node is forbidden., retrying
I1016 16:20:27.863719       1 drain_controller.go:193] node ip-10-0-22-244.us-east-2.compute.internal: initiating cordon (currently schedulable: false)
I1016 16:20:27.866998       1 drain_controller.go:193] node ip-10-0-22-244.us-east-2.compute.internal: RunCordonOrUncordon() succeeded but node is still not in cordon state, retrying
I1016 16:20:47.868665       1 drain_controller.go:193] node ip-10-0-22-244.us-east-2.compute.internal: initiating cordon (currently schedulable: true)
I1016 16:20:47.892043       1 drain_controller.go:581] cordon failed with: cordon error: admission webhook "unschedulable-webhook.default.svc" denied the request: Changing .spec.unschedulable on node is forbidden., retrying
I1016 16:21:27.892227       1 drain_controller.go:193] node ip-10-0-22-244.us-east-2.compute.internal: initiating cordon (currently schedulable: false)
I1016 16:21:27.896462       1 drain_controller.go:193] node ip-10-0-22-244.us-east-2.compute.internal: RunCordonOrUncordon() succeeded but node is still not in cordon state, retrying
I1016 16:22:47.896662       1 drain_controller.go:193] node ip-10-0-22-244.us-east-2.compute.internal: initiating cordon (currently schedulable: true)
I1016 16:22:47.906316       1 node_controller.go:606] Pool worker[zone=us-east-2a]: node ip-10-0-22-244.us-east-2.compute.internal: Reporting unready: node ip-10-0-22-244.us-east-2.compute.internal is reporting Unschedulable
I1016 16:22:47.911764       1 drain_controller.go:193] node ip-10-0-22-244.us-east-2.compute.internal: cordon succeeded (currently schedulable: false)
I1016 16:22:47.927238       1 node_controller.go:606] Pool worker[zone=us-east-2a]: node ip-10-0-22-244.us-east-2.compute.internal: changed taints
I1016 16:22:47.935186       1 drain_controller.go:193] node ip-10-0-22-244.us-east-2.compute.internal: initiating drain
E1016 16:22:48.970970       1 drain_controller.go:163] WARNING: ignoring DaemonSet-managed Pods: openshift-cluster-csi-drivers/aws-ebs-csi-driver-node-8jj5j, openshift-cluster-node-tuning-operator/tuned-2zg9m, openshift-dns/dns-default-jxd5t, openshift-dns/node-resolver-fkpj7, openshift-image-registry/node-ca-8snhd, openshift-ingress-canary/ingress-canary-vq97k, openshift-insights/insights-runtime-extractor-qswmb, openshift-machine-config-operator/machine-config-daemon-rsbl9, openshift-monitoring/node-exporter-46wkm, openshift-multus/multus-4fkrb, openshift-multus/multus-additional-cni-plugins-8458v, openshift-multus/network-metrics-daemon-rjsv5, openshift-network-diagnostics/network-check-target-4bwk4, openshift-network-operator/iptables-alerter-p6bqr, openshift-ovn-kubernetes/ovnkube-node-sfhnx
  6. Check that the configuration is properly applied on all nodes
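
The retry logs above alternate between "currently schedulable: true" (the write is denied) and "currently schedulable: false" (the call reports success although the node is still not cordoned). One way to read that pattern is sketched below as a toy model in Go. It is not the MCO or kubectl code: `fakeAPI`, `cordon`, and `cordonWithRetry` are hypothetical, and the sketch assumes the cordon helper mutates the node object it is given before the write is sent, which this conversation does not confirm. Under that assumption, a controller that keeps reusing the same local object never recovers, while one that switches to the latest state after a mismatch does.

```go
package main

import (
	"errors"
	"fmt"
)

// node is a minimal stand-in for a Kubernetes Node; only the cordon flag is modelled.
type node struct {
	Unschedulable bool
}

// fakeAPI is a hypothetical in-memory API server holding the authoritative node state.
type fakeAPI struct {
	stored     node
	denyWrites bool // simulates the admission webhook that rejects cordon requests
}

// get returns a fresh copy of the stored node.
func (a *fakeAPI) get() *node {
	n := a.stored
	return &n
}

// cordon models a helper that flips the flag on the object it is given before
// sending the write (an assumption of this sketch). If the local object already
// looks cordoned, it does nothing and reports success.
func (a *fakeAPI) cordon(n *node) error {
	if n.Unschedulable {
		return nil
	}
	n.Unschedulable = true
	if a.denyWrites {
		return errors.New("admission webhook denied the request")
	}
	a.stored.Unschedulable = true
	return nil
}

// cordonWithRetry tries to cordon up to maxAttempts times. The webhook is
// removed just before attempt number unblockAt. When refreshOnMismatch is true,
// a "success" that the stored state contradicts causes the local copy to be
// replaced with the latest state before the next attempt.
func cordonWithRetry(a *fakeAPI, maxAttempts, unblockAt int, refreshOnMismatch bool) (cordoned bool, attempts int) {
	local := a.get()
	for i := 1; i <= maxAttempts; i++ {
		if i == unblockAt {
			a.denyWrites = false
		}
		if err := a.cordon(local); err != nil {
			continue // denied; the local copy has already been mutated
		}
		latest := a.get()
		if latest.Unschedulable {
			return true, i
		}
		if refreshOnMismatch {
			local = latest
		}
	}
	return false, maxAttempts
}

func main() {
	stuck, n1 := cordonWithRetry(&fakeAPI{denyWrites: true}, 6, 5, false)
	fmt.Println("stale local copy:", stuck, n1)

	ok, n2 := cordonWithRetry(&fakeAPI{denyWrites: true}, 6, 5, true)
	fmt.Println("refreshed from latest state:", ok, n2)
}
```

In this model the stale variant exhausts all six attempts without cordoning, even though the webhook is removed before the fifth; the refreshing variant succeeds on the fifth attempt.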

/label qe-approved

@openshift-ci bot added the qe-approved label (Signifies that QE has signed off on this PR) Oct 16, 2025
@pablintino

/retest-required
/lgtm
/verified by @sergiordlr

@openshift-ci-robot added the verified label (Signifies that the PR passed pre-merge verification criteria) Oct 22, 2025
@openshift-ci-robot

@pablintino: This PR has been marked as verified by @sergiordlr.

@openshift-ci bot added the lgtm label (Indicates that a PR is ready to be merged.) Oct 22, 2025
@openshift-ci

openshift-ci bot commented Oct 22, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: openshift-cherrypick-robot, pablintino

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci bot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files.) Oct 22, 2025
@pablintino

/jira refresh

@openshift-ci-robot

@pablintino: This pull request references Jira Issue OCPBUGS-63126, which is invalid:

  • release note text must be set and not match the template OR release note type must be set to "Release Note Not Required". For more information you can reference the OpenShift Bug Process.
  • expected dependent Jira Issue OCPBUGS-63127 to be in one of the following states: VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE), CLOSED (DONE-ERRATA), but it is MODIFIED instead

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

@pablintino

/jira refresh
/retest-required

@openshift-ci-robot added the jira/valid-bug label (Indicates that a referenced Jira bug is valid for the branch this PR is targeting.) and removed the jira/invalid-bug label (Indicates that a referenced Jira bug is invalid for the branch this PR is targeting.) Oct 24, 2025
@openshift-ci-robot

@pablintino: This pull request references Jira Issue OCPBUGS-63126, which is valid. The bug has been moved to the POST state.

7 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.19.z) matches configured target version for branch (4.19.z)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)
  • release note type set to "Release Note Not Required"
  • dependent bug Jira Issue OCPBUGS-63127 is in the state Verified, which is one of the valid states (VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE), CLOSED (DONE-ERRATA))
  • dependent Jira Issue OCPBUGS-63127 targets the "4.20.0" version, which is one of the valid target versions: 4.20.0, 4.20.z
  • bug has dependents

Requesting review from QA contact:
/cc @sergiordlr

@openshift-ci bot requested a review from sergiordlr October 24, 2025 12:18

@isabella-janssen left a comment

/label backport-risk-assessed

Looks like a clean backport that has been approved and verified by the team.

@openshift-ci bot added the backport-risk-assessed label (Indicates a PR to a release branch has been evaluated and considered safe to accept.) Oct 24, 2025
@openshift-ci

openshift-ci bot commented Oct 24, 2025

@openshift-cherrypick-robot: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-bot bot merged commit 999097f into openshift:release-4.19 Oct 24, 2025
15 checks passed
@openshift-ci-robot

@openshift-cherrypick-robot: Jira Issue Verification Checks: Jira Issue OCPBUGS-63126
✔️ This pull request was pre-merge verified.
✔️ All associated pull requests have merged.
✔️ All associated, merged pull requests were pre-merge verified.

Jira Issue OCPBUGS-63126 has been moved to the MODIFIED state and will move to the VERIFIED state when the change is available in an accepted nightly payload. 🕓

Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • backport-risk-assessed: Indicates a PR to a release branch has been evaluated and considered safe to accept.
  • cherry-pick-approved: Indicates a cherry-pick PR into a release branch has been approved by the release branch manager.
  • jira/severity-important: Referenced Jira bug's severity is important for the branch this PR is targeting.
  • jira/valid-bug: Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • lgtm: Indicates that a PR is ready to be merged.
  • qe-approved: Signifies that QE has signed off on this PR
  • verified: Signifies that the PR passed pre-merge verification criteria